ABSTRACT

The first in a series of reports by the Far West
Laboratory for Educational Research and Development, this report
demonstrates the positive relationship between reduced class size and
pupil achievement. The researchers collected about 80 studies that
yielded over 700 comparisons of the achievement of smaller and larger
classes. The results showed that as class size increases, achievement
decreases. For example, the difference in being taught in a class of
20 versus a class of 40 shows an advantage of 6 percentile ranks. The
relationship between class size and achievement is slightly stronger
at the secondary level, but it does not differ appreciably across
different school subjects, levels of pupil IQ, or several other
demographic features of classrooms. The report suggests that schools
cannot afford the consequences of maintaining large classes all the
time and must find ways to finance smaller classes for some pupils or
for all pupils for part of the school day. (Author/LD)
PREFACE



This is the first in a series of reports to be published by the Class Size 
and Instruction Project, of the Far West Laboratory. A second meta-analysis, 
also under the direction of Drs. Gene V Glass and Mary Lee Smith, will be focused 
on the relationship of class size and classroom processes, teacher satisfaction, 
and pupil affect. It is scheduled for publication in early 1979. In the spring 
of 1979, a group of policy-makers will be commissioned to react to the meta- 
analyses. Information on obtaining these documents as they become available plus 
other publications emanating from the Class Size and Instruction Project may be 
obtained by contacting me, Dr. Leonard S. Cahen, at the address below. 

Drs. Glass and Smith have demonstrated that reduced class size and pupil
achievement are indeed associated. Their search has uncovered many studies that
have not been examined in earlier investigations of class size. The class size
issue begs in vain for a simple answer to the complex question, "What is the
ideal class size?" The research synthesis reported here does demonstrate the
trend: very small achievement advantages are expected when small reductions are
made in class size in the 20-30 pupil range and large advantages when class size
is reduced below 20. The reader must wrestle with value judgments. Are the
advantages worth the cost? In a country that prides itself on quality education
for all, the answer might be straightforward: schools cannot afford the
consequences of maintaining large classes all the time, and ways must be found to
finance smaller classes, at least for some pupils or for all pupils for part of
the school day.
SUMMARY

Research on the relationship between school class-size and academic
achievement is old, huge and widely believed to be inconclusive. Previous reviews
of the evidence have been overly selective and insufficiently quantitative. Timid
qualifications were offered where bold generalizations were possible. In the
summer of 1978, the New York Times gave front-page coverage to a study published
by Educational Research Services Inc. (Porwell, 1978). This organization is
funded jointly by the American Association of School Administrators, the Council
of Chief State School Officers, and several other professional administration
groups. The "Porwell Report" staggered visibly under the weight of the research
data and eventually arrived at the following conclusions, sad for teachers to
behold:

Research findings on class size to this point document 
repeatedly that the relationship between pupil achievement 
and class size is highly complex. 

There is general consensus that the research findings on the
effects of class size on pupil achievement across all grades
are contradictory and inconclusive.

Existing research findings do not support the contention 
that smaller classes will of themselves result in greater 
academic achievement gains for pupils. 

(Porwell, 1978, pp. 68-69) 

The research reported herein contradicts the conclusions of the Porwell 
Report. Indeed, it established clearly that reduced class-size can be expected 
to produce increased academic achievement. In pursuing this conclusion, we 
discovered many of the reasons why previous research reviewers lost their way in
the forest of data and failed to find a defensible generalization. 

We collected nearly 80 studies of the relationship between class-size and
achievement. These studies yielded over 700 comparisons of the achievement of
smaller and larger classes; these comparisons rest on data accumulated from
nearly 900,000 pupils of all ages and aptitudes studying in all manner of school
subjects. Using complex methods of regression analysis, the 700 comparisons were
integrated into a single curve showing the relationship between class-size and
achievement in general. This curve revealed a definite inverse relationship
between class-size and pupil learning. Similar curves were derived for a variety
of circumstances hypothesized to alter the relationship between achievement and
class-size. Virtually none of these special circumstances altered the basic
relationship: not grade level, nor subject taught, nor ability of pupils. Only
one factor substantially affected the curve, viz., whether the original study
controlled adequately (in the experimental sense) for initial differences among
pupils and teachers in smaller and larger classes. The nearly 100 comparisons of
achievement from the well-controlled studies thus form the basis of our
conclusion about how class-size is related to academic achievement. The most
accurate representation of this relationship is a curve derived from the 100
comparisons from well-controlled studies. This curve appears in the Figure below.
As class-size increases, achievement decreases. A pupil who would score at about
the 83rd percentile on a national test when taught individually would score at
about the 50th percentile when taught in a class of 40 pupils. The difference in
being taught in a class of 20 versus a class of 40 is an advantage of 6
percentile ranks. The major benefits from reduced class-size are obtained as size
is reduced below 20 pupils.
Figure 1. Relationship between achievement and class-size. (Data integrated
across approximately 100 comparisons from studies exercising good experimental
control.)



META-ANALYSIS OF RESEARCH ON CLASS-SIZE AND ACHIEVEMENT 


There is no point in recording the obvious about class-size: that teachers
worry about it more than nearly anything else, that administrators want to
increase it, that it is economically important, and the like. The problem with
class-size is the research. It is unclear. It has variously been read as
supporting larger classes, supporting smaller classes, and supporting nothing
but the need for better research. Review after review of the topic has dissolved
into cynical despair or epistemological confusion. The notion is wide-spread
among educators and researchers that class-size bears no relationship to
achievement. It is a dead issue in the minds of most instructional researchers.
To return to the class-size literature in search of defensible interpretations
and conclusions strikes many as fruitless. The endeavor is surrounded by a faint
aroma of Chippendale, which it resembles in other respects: unwieldy and antique.

One could document the confusion in previous reviews of research on the
class-size and achievement relationship. It would be simple to quote reviewer X
claiming that large classes are better, reviewer Y to the effect that small
classes are better, and reviewer Z that neither is better. But to do so would
only embarrass others and add nothing to one's appreciation of the complexity of
the research. The problems with previous reviews of the class-size literature
are several: (1) literature searches were haphazard and often overly selective;
dissertations were avoided, as a rule, and few reviewers sought out large
archives of pertinent data; (2) reviews were typically narrative and discursive;
the multiplicity of findings cannot be absorbed without quantitative methods of
reviewing; (3) reviewers who attempted quantitative integration of findings made
several mistakes: (a) they used crude classifications of class-sizes; (b) they
took "statistical significance" of differences far too seriously; and (c) they
lacked sufficiently sophisticated techniques of integrating results.

In the research reported here, an attempt was made to correct these
shortcomings and determine if the huge research literature on class-size and
achievement really was hopelessly confusing or if its message was merely buried
in myriad results waiting to be coaxed out with more advanced methods of research
integration.

The Literature Search 

The search for class-size studies was carried out in three places: (1)
document retrieval and abstracting resources; (2) previous reviews of the
class-size literature; and (3) the bibliographies of studies once found. The ERIC
system and Dissertation Abstracts were searched completely on the key words
"size," "class size," and "tutoring." The dissertation literature was covered as
far back as 1900, and the fugitive educational research literature was covered
from the mid-1960s to 1978. Of the many hundreds of doctoral dissertations
scanned in Dissertation Abstracts, about thirty microfilm copies were purchased.
About a dozen of these dissertations were incorporated; the remainder dealt with
non-achievement and process variables that will be covered in subsequent work.
The journal literature on class-size was located in the traditional way: one or
two current reviews of the research were found (the Ryan and Greenfield (1975)
review and the comprehensive review by Lafleur, Sumner and Witton (1974)
were particularly comprehensive and helpful), the articles cited were
located, and the articles cited in these articles were located in turn.

Approximately 300 documents were obtained and read. One hundred-fifty of 
them were found to contain no usable data, i.e., no data whatsoever were reported 
on the comparison of small- and large-class achievement. About 70 studies 
examined the relationship of class-size to non-achievement outcomes and classroom 
process variables. Approximately 80 studies on the class-size and achievement 
relationship were included in this analysis. 

It is difficult to estimate what portion of the existing literature was
captured by this search. Even though the corpus of 80 studies exceeds by 50
percent the most extensive reviews published to date (and these reviews are
narrative and inconclusive), it is conceivable that less than half of all studies
that exist on the topic were found. Some studies (credited to school districts)
could not be located even after several phone calls and letters. Other studies
were surely missed because of odd or nondescript titles. The dissertation search
was conducted on key words such as "size," "class-size," and "tutoring," but the
words must appear in titles to be registered in the index to Dissertation
Abstracts. (Fortunately, the ERIC system uses key words based on the contents of
a paper and not titles alone.) Several studies found in the journal literature
by branching off existing bibliographies had neither "size" nor "class-size" in
the title, evidence enough that several dissertations were missed because their
titles lacked the key words. Still another complication concerns the use of
class-size as an incidental variable in studies focused on other issues. There
are probably many such studies, and only a few were
located.

The Texture of the Literature 

In what follows in this integrative analysis, one can easily lose touch with
precisely what kinds of research are being integrated. The statistics and graphs
that represent the findings of this meta-analysis of class-size research will
seem far removed from the original studies themselves. And, in a very real
sense, what will be done for the sake of arriving at general conclusions places
the reader in benign jeopardy of losing qualitative and personal familiarity with
the research. In this section, the general texture of the class-size literature
will be described, and a few studies typical of various eras will be reported.

The research on class-size and its relationship to achievement falls into
four stages: the pre-experimental era (1895-1920); the primitive experimental
era (1920-1940); the large-group technology era (1950-1970); and the
individualization era (1970-present). The boundaries of the eras are not
impenetrable, and even today an atavistic throwback to the 19th century will
appear in a doctoral thesis. At each new stage, the sophistication of research
methodology increased, and the question of class-size and its effect on
achievement was examined with different motives. One discerns in the narration
accompanying the numbers the cult of efficiency of the early part of this
century, the rising birth rate of the post-war '40s, the advent of teaching
technology in the '60s, and most recently the teacher labor movement combined
with declining enrollments. What was said about the data changed as new
interpretations served emerging purposes, even when the data changed little.

The first empirical study on educational processes and their effects on
achievement included an examination of the class-size question (Rice, 1902). No
strong relationship of class-size to attainment was observed. But unfortunately,
Rice reported virtually no numbers, and it is impossible to determine now whether
the relationship Rice found was genuinely small or whether it was moderately
large but only seemed small to Rice, who may have expected much more. Rice's
study was followed by several similar analyses on new data collected between 1900
and 1920. These studies are typified by their rugged non-experimental logic. A
study by Cornman (1909) can serve as an example.

Cornman examined the promotion records for January 1909, in District No. 6,
Philadelphia. Before the day of "social promotion," the passage from one grade
to the next higher indicated adequate achievement at the lower grade. Cornman
categorized classes into three groups: under 40 pupils, 40 to 49, 50 or more.
The rate of promotion was calculated for each class-size category. At grade 3,
88 percent of 400 pupils in classes of 40 or fewer were promoted, 85 percent of
1,300 pupils in classes of size 40 to 49 were promoted, and 81 percent of 640
pupils were promoted in classes of over 50 pupils. Cornman also investigated
"satisfactory conduct" ratings by teachers in classes of different sizes. The
discussion of results showed little sensitivity to questions of experimental
control; such concerns were doubtless not wide-spread at the time.

Beginning in the early 1920s, the class-size and achievement question was
approached with better methods. Studies began to appear that used matching of
pupils in large and small classes on ability and achievement; content and methods
were standardized in the two classes. Occasionally the same teachers taught
classes of both sizes. For example, [...] and Beeson [...] studied the
relationship between class-size and achievement in grammar and English at the
high-school level in Grand Junction, Colorado. In the Fall of 1922, three English
classes of 44, 34, and 20 pupils were formed. Their Terman Group Test IQs were
nearly identical at the first, second, and third quartiles. "After thoroughly
establishing our classes, our method of conducting the experiment was merely to
proceed with the year's work in the usual way, except that we found it necessary
to depend rather more than usual on test grades, because the number of pupils in
the large class made it impossible for each pupil to make many daily recitations
each period" (p. 127). The experiment was run for nine weeks. Then the Starch
Grammar Test and Kirby Grammar Test were administered along with some specially
designed classroom tests on clauses. The findings slightly favored the two
smaller classes over the class of 44.

In the 1940s, class-size research went dormant when educational researchers
went to war. It was revived along with the rest of the field in the 1950s and
1960s. Researchers seemed intent on demonstrating, particularly at the college
level, that lecture classes could be doubled or tripled in size without loss of
effectiveness. At about the same time, massive empirical studies of education
were undertaken to inform national education policy: the Coleman study of
equality of educational opportunity (1966); Project TALENT; the International
Assessment of Education in mathematics and reading; and surveys of
government-funded programs of compensatory education (Title I). These large
empirical studies typically included, as incidental features, data on the
relationship of class-size and achievement. The study by Nelson (1959) is
representative of the first kind of study to appear in the '50s and '60s; the
Coleman et al. (1966) study is like many studies of the second type.

In 1959, Nelson reported on a study of large-group college instruction.
Four instructors were involved, each teaching one large and one small section of
elementary economics. The pupils in each instructor's classes were matched on
major (e.g., business, engineering), level (freshman, sophomore), and sex. The
course was taught three hours a week for a semester. The class-sizes compared
were 20 vs. 138, 16 vs. 141, 20 vs. 94, 20 vs. 90, 17 vs. 109, 17 vs. 94, 19 vs.
85. A common final examination was administered to all 14 classes. Achievement
outcomes were adjusted by covarying on students' prior grade-point average. The
means favored the larger classes by three one-thousandths standard deviation.

The Coleman study is famous. Tens of thousands of pupils in grades 1, 3, 6,
9, and 12 were surveyed. Achievement tests were administered and "school
resources" were measured at the level of the school, e.g., teachers' experience,
use of special programs. Among these resource variables was pupil/instructor
ratio. The P/I ratio was correlated with pupil achievement. The correlations
were generally negative. When Mayeske et al. (undated) partialed out three or
four other variables which might have obliterated these correlations, the r's
remained consistently negative.
The research relevant to class-size that appeared in the 1970s showed a
concern for establishing the benefits of individualization. Experiments were
performed that involved radically reduced instructional group sizes, one teacher
with two or three pupils. Studies of individual pupils taught by computer or
machine have also become common; they were not considered in this integrative
analysis since the particular concern here is with the processes of human
instruction. (For a meta-analysis of tutoring and computer-assisted instruction
in mathematics that produced surprising findings, see Hartley, 1977.) An
experiment typical of studies of radically reduced group size was conducted by
Bausell et al. (1972). Pupils in grades 4 and 5 were randomly assigned to receive
either individual tutoring on exponential arithmetic for one hour across two days
or instruction by randomly comparable teachers for the same amount of time in a
class of 25 pupils. Instruction was a part of an on-going school program. A
test designed to cover only the content of the instruction was administered to
all pupils. Pupils in "class-size 1" scored approximately one-half standard
deviation above pupils in "class-size 25" on the achievement tests.

Methods

In this section, the methods are described by which the studies were coded
and the quantitative findings integrated.
Defining the Field

The problem of this meta-analysis is to determine what the available
research proves about the relationship of class-size to achievement. Drawing
boundaries around this topic was simple compared to the difficulties
encountered in defining psychotherapy, for example (Smith and Glass, 1977).
Conventional definitions of "achievement" seem scarcely to have changed over
eighty years. "Class-size" can be described and quantified in several
different ways, but it was relatively easy to select one approach. Definitions
of class-size differ in terms of how close they are to the reality of the
child's experience in the classroom. Some definitions, such as "Numerical
Staff Adequacy," reflect the ratio of staff to pupils on a district-wide
basis. Such definitions are relatively distant from the classroom. On the
other hand, within a conventional classroom several instructors can
be present, thus reducing the actual instructional group size for a particular
student. Instructional group size is very close to the child's experience in
the classroom. Because of an interest in the classroom processes that
presumably mediate the relationship of class-size to achievement, we chose a
definition which is close to classroom reality. In this review, "class-size"
is defined as the ratio of pupils to instructors, or instructional group
size. In most studies, this was the same as the size of the classroom unit,
but in some it was not.
Coding Characteristics of Studies

The quantification of characteristics of studies permits the eventual
statistical description of how properties of studies affect the principal
findings. Such questions can be addressed as "How does the class-size and
achievement relationship vary as a function of age of pupils?" or "How does it
vary between reading and math instruction?" The first step in coding studies is
to identify those properties of studies that might interact with the relationship
between class-size and achievement. There is no systematic and logical procedure
for taking this step. One simply reads a few studies from the literature of
interest, talks with experts, and then makes a best guess; modifications can
always be made later if needed. The best guesses as to which conditions might
mediate the relationship fell into five broad categories: Study Identification,
Instruction, Classroom Demographics, Study Conditions, and Outcome Variable.
About 25 specific items fell into these categories. Some were more fruitful than
others; several items were seldom reported in the research publications. A
coding sheet was devised onto which the information about each study could be
transcribed. A single study might fill several coding sheets, depending on how
many different class sizes were compared in pairs, how many different achievement
tests were reported, whether data were reported separately for different ages or
IQs, and so forth.

The major items of the coding sheet are reported below:

IDENTIFICATION:

1) Year. This item was included to check on whether there is a time
trend in the class-size and achievement relationship.

2) Source of Data. Whether from a journal, book, thesis, or unpublished
source.

INSTRUCTION:

3) Subject. The subject taught (reading, math, etc.) was recorded.

4) Duration of Instruction. The amount of teaching was recorded in
hours and in weeks.

5) No. of Pupils. The numbers of pupils on which the small and large-
class achievement means were based were recorded. This number was not
the same as the "class-size" since there might be several small or
large classes used in the study.

6) No. of Instructional Groups. (See #5 above.)

7) No. of Instructors. (See #5 above.)

8) Pupil/Instructor Ratio. This measure is the measure of class-size.
One teacher with a group of 30 counts as a P/I ratio of 30; two teachers
in a class of 30 gives a P/I of 15.



CLASSROOM DEMOGRAPHICS: 

9) Pupil Ability. Average IQ of the pupils was estimated when not
reported; three broad categories were used: IQ ≤ 90; 90 < IQ < 110;
IQ > 110.

10) Ages and Average Age. These two variables permitted discriminating
instances in which all pupils were of one age from studies in which
pupils of several ages were represented and the average age was used
to describe their level since data were not reported separately. This
variable was used to distinguish data from elementary and secondary
school levels.

STUDY CONDITIONS:

11) Assignment of Pupils and Teachers to Groups. The assignment of pupils
and teachers to classes of different sizes was described as either
"random," "matched," "repeated measures," or "uncontrolled." These
variables were important in describing the degree of experimental
control exercised in the study. "Random" is obvious; "matched" refers to
attempts to equate small and large classes by other than random means
on pretests of achievement or ability; "repeated measures" refers to
using either the same pupils or teacher in both small and large
classes, e.g., 10 pupils might be taught alone and then in a group of
40 and their achievement compared; "uncontrolled" should be obvious.
OUTCOME VARIABLE:

12) Type of Achievement Measure. Outcomes were measured by standardized
achievement tests, specially designed (ad hoc) tests, or teachers'
assessments of achievement. The latter two categories were grouped.

13) Quantification of Outcomes. In some instances, a degree of
experimental control could be attained by expressing achievement as gains
from pretest to posttest or covariance adjusting posttest means for
pretest differences. If this was done, it was noted.
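
One way to picture a single coding sheet is as a small record type. The sketch below is only an illustration: the class name `CodingSheet`, its field names, and the example values are hypothetical and cover only some of the thirteen items, and the ratio method simply restates the rule given under item 8.

```python
from dataclasses import dataclass


@dataclass
class CodingSheet:
    """One small-vs-large comparison arm from one study (hypothetical field names)."""
    year: int             # item 1
    source: str           # item 2: "journal", "book", "thesis", or "unpublished"
    subject: str          # item 3
    n_pupils: int         # item 5: pupils behind the reported mean
    n_groups: int         # item 6: instructional groups
    n_instructors: int    # item 7
    assignment: str       # item 11: "random", "matched", "repeated measures", "uncontrolled"

    def pupil_instructor_ratio(self) -> float:
        # Item 8: pupils per instructor, the report's measure of class-size.
        if self.n_instructors <= 0:
            raise ValueError("at least one instructor is required")
        return self.n_pupils / self.n_instructors


# Two teachers sharing one class of 30 pupils (invented example values).
sheet = CodingSheet(1972, "journal", "math", 30, 1, 2, "random")
print(sheet.pupil_instructor_ratio())
```

With one instructor the same record would give a ratio of 30, matching the example in item 8.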

Quantifying Outcomes 

A simple statistic is desired that describes the relationship between
class-size and achievement as determined by a study. No matter how many
class-sizes are compared, the data can be reduced to some number of paired
comparisons, a smaller class against a larger class. Certain differences in the
findings must be attended to if the findings are later to be integrated. The most
obvious differences involve the actual sizes of "smaller" and "larger" classes
and the scale properties of the achievement measure. The actual class-sizes
compared must be preserved and become an essential part of the descriptive
measure. The measurement scale properties can be handled by standardizing all
mean differences in achievement by dividing by the within-group standard
deviation (a method that is complete and discards no information at all under the
assumption of normal distributions). The eventual measure of relationship seems
straightforward and unobjectionable:
$$\Delta_{S-L} = \frac{\bar{X}_S - \bar{X}_L}{\hat{\sigma}}$$

where:

$\bar{X}_S$ is the estimated mean achievement of the smaller class, which
contains S pupils;

$\bar{X}_L$ is the estimated mean achievement of the larger class, which
contains L pupils; and

$\hat{\sigma}$ is the estimated within-class standard deviation, assumed to be
homogeneous across the two classes.
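
The standardized difference just defined can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the pooled-variance formula is one common way to estimate a homogeneous within-class standard deviation (the report does not say how it was estimated), and the scores are invented.

```python
import math


def pooled_within_sd(sd_small, n_small, sd_large, n_large):
    # Assumption: pool the two class variances, weighting by degrees of freedom.
    num = (n_small - 1) * sd_small ** 2 + (n_large - 1) * sd_large ** 2
    return math.sqrt(num / (n_small + n_large - 2))


def delta(mean_small, mean_large, sd_within):
    # Standardized mean difference, smaller class minus larger class.
    if sd_within <= 0:
        raise ValueError("within-class standard deviation must be positive")
    return (mean_small - mean_large) / sd_within


# Invented scores: a class of 20 averaging 52 against a class of 40 averaging 50.
sd = pooled_within_sd(10.0, 20, 10.0, 40)
print(delta(52.0, 50.0, sd))
```

A positive value favors the smaller class; a negative value favors the larger one.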
As a first approximation to studying the class-size and achievement
relationship, it is considered irrelevant that the particular types of
achievement that lie behind the variable X are quite different knowledges and
skills measured in quite different ways.

If distributional assumptions about X are needed to add meaning to particular
values of $\Delta_{S-L}$, normality will be assumed. For example, suppose
$\Delta_{S-L} = +1$. Then, assuming normal distributions within classes, the
average pupil in the smaller class scores at the 84th percentile of the larger
class. These interpretations are occasionally helpful, but seldom critical, and
our investment in the normality assumption is not great. It would be no surprise
nor any concern if the assumption proved to be more or less wrong, and it's
probably not far off in most instances.
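
The percentile reading of a standardized difference follows directly from the normal distribution function. A minimal sketch, assuming unit-normal achievement within classes as the text does (the function name is hypothetical):

```python
from statistics import NormalDist


def percentile_in_larger_class(delta):
    # Percentile of the larger class at which the average pupil of the
    # smaller class scores, assuming normal distributions within classes.
    return 100.0 * NormalDist().cdf(delta)


print(round(percentile_in_larger_class(1.0)))   # one standard deviation advantage
print(round(percentile_in_larger_class(0.0)))   # no difference between classes
```

A difference of zero places the average pupil at the 50th percentile, and a difference of one standard deviation at about the 84th.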

Calculating $\Delta_{S-L}$

Reports of research frequently omit such basic descriptive measures as means
and standard deviations. This omission frequently complicates the calculation of
$\Delta_{S-L}$, but seldom obviates it. Transformations of commonly reported
statistics (t, F, etc.) into $\Delta$'s can be derived (Glass, 1973). A special
problem in calculation of $\Delta_{S-L}$ concerns studies in which class-size is
correlated with achievement across many classrooms (e.g., Coleman, 1966;
Robinson, 1963). In these instances, $\Delta_{S-L}$ was calculated as follows.
The distribution of class-sizes was determined by assuming normality and noting
the mean and standard deviation. The regression coefficient was calculated for
the regression of achievement (assumed to be calculated on a unit-normal scale)
onto class-size via $\beta = r\,\sigma_y / \sigma_x$. Then the class-sizes at the
25th and 75th percentiles, assuming normality, were determined. These became the
"smaller" and "larger" classes. Finally, the achievement in these classes was
determined via the formula $\beta(X - \bar{X})$, where X is "class-size." The
value of $\Delta_{S-L}$ is then readily calculated. Some studies involved only a
dichotomous achievement measure (e.g., "promoted (to the next grade) vs. not
promoted"). Proportions thus derived were transformed into metric information and
then into values of $\Delta_{S-L}$ by means of the probit transformation
(see Glass, 1978).
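
The two conversions described above can be sketched as follows. This is a hedged reading of the text, not the authors' code: the function names are hypothetical, the correlation and class-size figures in the first call are invented, and the probit step is written in its usual form (difference of normal deviates). The second call uses the grade 3 promotion rates quoted earlier from Cornman purely as an illustration.

```python
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()


def delta_from_correlation(r, mean_size, sd_size):
    # Achievement is on a unit-normal scale, so beta = r * 1 / sd_size.
    beta = r / sd_size
    z75 = _UNIT_NORMAL.inv_cdf(0.75)
    small = mean_size - z75 * sd_size      # class-size at the 25th percentile
    large = mean_size + z75 * sd_size      # class-size at the 75th percentile
    ach_small = beta * (small - mean_size)
    ach_large = beta * (large - mean_size)
    return small, large, ach_small - ach_large


def delta_from_proportions(p_small, p_large):
    # Probit transformation: difference of the normal deviates of two proportions.
    return _UNIT_NORMAL.inv_cdf(p_small) - _UNIT_NORMAL.inv_cdf(p_large)


# Invented correlational study: r = -0.10, class-sizes with mean 30 and SD 5.
print(delta_from_correlation(-0.10, 30.0, 5.0))
# Promotion rates of 88 percent (smallest classes) and 81 percent (largest).
print(delta_from_proportions(0.88, 0.81))
```

Under these assumptions the class-size standard deviation cancels, so the first result depends only on r: a negative correlation between class-size and achievement yields a positive difference in favor of the smaller class.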

Describing the Class-Size and Achievement Relationship 

There exist several alternative statistical techniques for integrating a
large set of $\Delta_{S-L}$'s so as to describe the aggregated findings on the
class-size and achievement relationship. A large, square matrix could be
constructed in which the rows and columns are class-sizes and the cell entries
are average values of $\Delta_{S-L}$; nearly equal values of average deltas
could be connected by lines to form "iso-deltas" in much the manner as economic
equilibrium curves are used to depict three-variable relationships. Or a
variation of psychometric scaling could be employed: a square matrix of
class-sizes could be constructed for which each cell entry would be the
proportion of times the row class-size gave achievement greater than the column
class-size. This matrix could be scaled by means of Thurstone's Law of
Comparative Judgment, which would locate the class-sizes along an achievement
continuum. (This method was used and the results were reasonably satisfactory;
but they add little to findings obtained by more direct means that are reported
here.) Finally, regression equations could be constructed in which
$\Delta_{S-L}$ is partitioned into a weighted linear combination of S and L and
functions thereof and error. There is much to recommend this latter procedure,
and the technique eventually employed is a variation of it. But the regression
of $\Delta_{S-L}$ onto only S and L requires three dimensions to be depicted.
Anything more complex than a simple two-dimensional curve relating achievement
to the size of class was considered undesirably complicated and beyond the easy
reach of most audiences who hold a stake in the results.
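
The psychometric-scaling alternative can be sketched as follows. The report
does not say which variant of Thurstone's law was applied; this sketch assumes
the common Case V simplification (equal, uncorrelated discriminal dispersions),
in which each class-size is scored by the average normal deviate of its win
proportions. The function name and the proportions are hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(win_prop):
    """Return one scale value per class-size from a matrix of win proportions.

    win_prop[i][j] is the proportion of comparisons in which class-size i
    showed higher achievement than class-size j. Diagonal entries are
    ignored; off-diagonal entries must lie strictly between 0 and 1.
    """
    inv_cdf = NormalDist().inv_cdf
    k = len(win_prop)
    scale = []
    for i in range(k):
        # Normal deviate for each pairing; the diagonal contributes z = 0,
        # so dividing by k gives the row mean including the diagonal.
        z_row = [inv_cdf(win_prop[i][j]) for j in range(k) if j != i]
        scale.append(sum(z_row) / k)
    return scale


# Hypothetical proportions for class-sizes 10, 20 and 40 (not from the report).
sizes = [10, 20, 40]
win_prop = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5],
]
scale_values = thurstone_case_v(win_prop)
```

The resulting values are on an interval scale with an arbitrary origin, so only
their spacing along the achievement continuum is meaningful.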

The desire to depict the aggregate relationship as a single line curve is
confounded with the problem of essential inconsistencies in the design and
results of the various studies. A single study of class-size and achievement
may yield several values of $\Delta_{S-L}$. In fact, if k different class-sizes
are compared on a single achievement test, k(k-1)/2 values of $\Delta_{S-L}$
will result. This set of Δ's from a single study will form a consistent set of
values in that they can be joined to form a single connected graph depicting
the curve of achievement as a function of class-size. However, various values
of $\Delta_{S-L}$ arising from different studies can show confusing
inconsistencies. For example, suppose that Study #1 gave $\Delta_{10-15}$ and
$\Delta_{15-20}$; Study #2 gave $\Delta_{15-30}$, $\Delta_{15-40}$, and
$\Delta_{30-40}$.

A few moments' reflection will reveal that there is no obvious or simple way to
connect these values into a single connected curve.

The eventual solution to these problems proceeded as follows: $\Delta_{S-L}$ was
regressed onto a quadratic function of S and L by means of the least-squares
criterion; then that set of values of Δ that could be expressed as a single,
connected curve was found.

The regression model selected accounted for variation in $\Delta_{S-L}$ by means
of S, $S^2$, and L. Obviously, something more than a simple linear function of S
and L was needed, otherwise a unit increase in class-size would have a constant
effect regardless of the starting class-size S; and the $S^2$ term seemed as
capable of filling the need as any other. The size differential between the
larger and smaller class, L-S, was used in place of L for convenience. Thus,
the $\Delta_{S-L}$ values were used to fit the following model:

$$\Delta_{S-L} = \beta_0 + \beta_1 S + \beta_2 S^2 + \beta_3 (L - S) + e \qquad (1)$$

Fitting this model by least-squares will result in the curved regression surface

$$\hat{\Delta}_{S-L} = \hat{\beta}_0 + \hat{\beta}_1 S + \hat{\beta}_2 S^2 + \hat{\beta}_3 (L - S) \qquad (2)$$
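
A least-squares fit of model (1) can be sketched with the standard library
alone. This is an illustration under stated assumptions, not the authors'
program: the helper name `fit_quadratic_model` is hypothetical, the normal
equations are solved by plain Gauss-Jordan elimination, and the demonstration
rows are synthetic, noise-free values generated from the full-data
coefficients that the report gives in its Findings section.

```python
def fit_quadratic_model(rows):
    """Least-squares estimates [b0, b1, b2, b3] for model (1).

    rows: iterable of (S, L, delta) triples, S < L.
    """
    rows = list(rows)
    # Design matrix columns: intercept, S, S^2, L - S.
    X = [[1.0, s, s * s, l - s] for s, l, _ in rows]
    y = [d for _, _, d in rows]
    p = 4
    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot_row = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Synthetic check: generate exact deltas from known coefficients, then refit.
true_coeffs = [0.57072, -0.03860, 0.00059, 0.00082]
synthetic_rows = [
    (s, s + d, true_coeffs[0] + true_coeffs[1] * s
     + true_coeffs[2] * s * s + true_coeffs[3] * d)
    for s in (1, 5, 10, 15, 20, 25, 30)
    for d in (5, 10, 20)
]
estimates = fit_quadratic_model(synthetic_rows)
```

With real data the rows would be the 725 coded comparisons, and the residual
term e in (1) would no longer be zero.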
The problem now is to find the set of Δ's in this surface that can be
depicted as a single curved-line relationship in a plane. The property that must
hold for a set of Δ's before they can be depicted as a connected graph in a plane
is what might be called the consistency property:

$$\Delta_{n_1-n_2} + \Delta_{n_2-n_3} = \Delta_{n_1-n_3}$$

for $n_1 < n_2 < n_3$. If this property is not satisfied, then one is in the
strange situation of claiming that the differential achievement between
class-sizes 10 and 20 is not the sum of the differential achievement from 10 to
15 and then from 15 to 20.

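When the consistency property is imposed on (2), it follows that:

$$\hat{\beta}_0 + \hat{\beta}_1 n_1 + \hat{\beta}_2 n_1^2 + \hat{\beta}_3 (n_2 - n_1) + \hat{\beta}_0 + \hat{\beta}_1 n_2 + \hat{\beta}_2 n_2^2 + \hat{\beta}_3 (n_3 - n_2) = \hat{\beta}_0 + \hat{\beta}_1 n_1 + \hat{\beta}_2 n_1^2 + \hat{\beta}_3 (n_3 - n_1) \qquad (3)$$

Simple algebraic reduction of (3) produces the following:

$$\hat{\beta}_0 + \hat{\beta}_1 n_2 + \hat{\beta}_2 n_2^2 = 0 \qquad (4)$$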

The two solutions to the quadratic equation in (4) are points $n_2$ such that
if $\Delta_{S-L}$ is measured with $n_2$ as either the larger, L, or smaller, S,
class-size, then the resulting set of Δ's will lie on the four dimensional
regression curve in (2) but can be depicted as a single line curve in a plane.
Since $n_2$ becomes the point around which values of $n_1$ and $n_3$ are
selected, it will be called the pivot point. That there are two solutions for
$n_2$ is perplexing; fortunately, in the analyses to be reported the two
corresponding curves were virtually parallel in practice.
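
The reduction from (3) to (4) can be checked numerically. Subtracting the right
side of (3) from the left, the $\hat{\beta}_3$ terms telescope and the terms in
$n_1$ cancel, leaving exactly the quadratic in $n_2$. The sketch below uses the
full-data coefficient estimates given in the Findings section of this report;
the function names are illustrative only.

```python
# Full-data estimates reported in the Findings section of this report.
B0, B1, B2, B3 = 0.57072, -0.03860, 0.00059, 0.00082


def delta_hat(smaller, larger):
    """Predicted standardized difference from regression surface (2)."""
    return B0 + B1 * smaller + B2 * smaller ** 2 + B3 * (larger - smaller)


def consistency_gap(n1, n2, n3):
    """How far a triple n1 < n2 < n3 departs from the consistency property."""
    return delta_hat(n1, n2) + delta_hat(n2, n3) - delta_hat(n1, n3)


def pivot_quadratic(n2):
    """Left-hand side of equation (4)."""
    return B0 + B1 * n2 + B2 * n2 ** 2


# The gap depends only on the middle class-size and equals the quadratic in (4).
gap_10_15_20 = consistency_gap(10, 15, 20)
gap_5_15_40 = consistency_gap(5, 15, 40)
```

Because the gap is the same for every choice of $n_1$ and $n_3$, the property
holds for all triples sharing a middle value precisely when that value is a root
of (4).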

A single line curve in a plane can be constructed by solving for one or the
other values of $n_2$ in (4) and constructing a set of $\hat{\Delta}$ values.
These values will give the standardized mean differences in achievement between
$n_2$ and any other class-size. The curve that connects these $\hat{\Delta}$'s
has no non-arbitrary starting point. One can assume for convenience sake that
the achievement curve (z), instead of the differential achievement curve (Δ),
is centered around an arbitrary class-size, e.g., something like the national
average in the low 20's. Finally, for descriptive purposes, the metric of
percentile ranks was chosen over the metric of z-scores; thus the curve z was
transformed into a curve of percentile ranks by assuming a normal distribution
of achievement.

Comment on Statistical Inference 

In the analyses that follow, ordinary matters of statistical inference have
been ignored. The application of usual interval estimation procedures or
statistical tests makes little sense for two reasons. First, the data base is
laced with a complicated structure of interdependent observations; several
comparisons arise from a single study when more than two class-sizes are
compared, and there is no sensible way to reduce each study to one observation.
Even if a study involves comparing only two class-sizes, there might have been
comparisons of reading and math achievement. It makes far less sense to average
these than to let each be separately entered in the data base. The data bases
of most meta-analyses are complex nested and multi-level arrangements. The
methods of analyzing them fully await a full explication; methodological work
on these problems has been launched in promising directions (Burstein, 1978).
Secondly, randomization is absent from the data set in any form that would make
probabilistic models based on it applicable. To the extent that one might care
to infer to populations of pupils, the sample size is so large that
significance tests would be an empty pro forma ritual. To the extent one might
wish to infer to populations of studies, it must be recognized that the studies
included have in no way been sampled from any conceivable population. Error and
instability of various odd sorts exist in the data set; how they should be
dealt with is not at all apparent.

Findings 

The report of findings falls into two broad categories: (1) description of
the data base and (2) regression analyses relating achievement and class-size.

Description of the Data Base

In all, 77 different studies were read, coded, and analyzed. These studies
yielded a total of 725 Δ's. The comparisons are based on data from a total of
nearly 900,000 pupils spanning 70 years' research in more than a dozen
countries. (The entire set of data is reproduced in the appendix to this
report.)

The total body of evidence can be described partly in quantitative terms
through use of frequency distributions of characteristics of the studies. These
tabulations will be presented in terms of Δ's rather than studies. The
descriptive data do not only communicate an understanding of the evidence upon
which the conclusions rest; they point to the relatively over-studied and
under-studied aspects of the topic and can help guide future research on
class-size and achievement.

In Table 1 appears the frequency distribution of Δ's by year in which the
study appeared. It is clear from Table 1 that class-size research was an active
early topic in educational research, was largely abandoned for 30 years after
1930, and has been resurrected in the last 15 years.

                              Table 1

          Class-Size Comparisons (Δ) by Year of Study

    Year          No. of Δ's        %        Cumulative %

    1900-1909          22          3.0%          3.0%
    1910-1919         184         25.4%         28.4%
    1920-1929         138         19.0%         47.4%
    1930-1939          47          6.5%         53.9%
    1940-1949           1          0.0%         53.9%
    1950-1959          62          8.6%         62.5%
    1960-1969         150         20.8%         83.3%
    1970-1979         121         16.7%        100.0%

    Total             725        100.0%

In Table 2 appear data on the publication source from which the comparisons
were drawn. Although published journal articles are the major source of data,
about 20% of the data were found in theses and unpublished reports, both of
which have not been well covered in previous reviews.

In Table 3 appear the frequencies of comparisons categorized by the school
subject taught in the study. Nearly half of the comparisons came from studies in
which elementary school pupils were taught all subjects in classes of varying
sizes. There is surprisingly little work on reading alone; however, the 342 "all
subjects combined" comparisons typically include reading as an important element.

In Table 4 are reported the numbers of hours of instruction given in the
classes being compared. The range is enormous, from a single hour for a very
small scale tutoring study, to 9,000 hours, representing five years of elementary
school instruction. The "hours of instruction" distribution shows three modes:
50, 180, and 900 hours. These times correspond to a three credit-hour
semester-long course, a five credit-hour year-long course, and a year of
teaching five hours per day. The literature does not lack studies conducted
over significant intervals of time. The average duration is 536 hours with a
standard deviation of 1033 hours and a skewness of 5.58.

In Table 5 appears the distribution of comparisons for various ages of
pupils. Research is spread fairly evenly across the elementary and secondary
grades. The first four years of school are only slightly underrepresented. The
average age represented in the 725 comparisons is 12.3 years with a standard
deviation of 4.0 years.

                              Table 2

        Class-Size Comparisons (Δ) by Publication Source

    Source            No. of Δ's        %

    Journal               474         65.4%
    Book                  114         15.7%
    Thesis                 60          8.3%
    Unpublished            77         10.6%

    Total                 725        100.0%

                              Table 3

       Class-Size Comparisons (Δ) by Subject of Instruction

    Subject Taught                          No. of Δ's        %

    All Subjects Combined                       342         47.2%
      (i.e., elementary school classes)
    Reading                                      39          5.4%
    Mathematics                                  84         11.6%
    Language                                    144         19.9%
    Psychology                                   23          3.2%
    Natural/Physical Sciences                    28          3.9%
    Social Sciences and History                  40          5.5%
    All Others                                   25          3.4%

    Total                                       725        100.0%

                              Table 4

        Class-Size Comparisons (Δ) by Hours of Instruction

    Hours of Instruction    No. of Δ's       %       Cumulative Percent

         1-10                    26         3.6%            4.5%
        11-20                    40         5.5%           11.4%
        21-40                    40         5.5%           18.4%
        41-60                    50         6.9%           27.0%
        61-100                   30         4.1%           32.2%
       101-150                   23         3.2%           36.2%
       151-200                  126        17.4%           58.1%
       201-300                   17         2.3%           61.0%
       301-400                    3         0.4%           61.5%
       401-500                   30         4.1%           66.7%
       501-800                   37         5.1%           73.1%
       801-1000                 132        18.3%           96.0%
      3600                       18         2.6%           99.1%
      9000                        5         0.8%          100.0%
      Unknown                   148        20.4%

      Total                     725       100.0%

                              Table 5

           Class-Size Comparisons (Δ) by Age of Pupils

    Age            No. of Δ's        %       Cumulative Percent

     5-6                56          7.7%            7.7%
     7-8                55          7.6%           15.3%
     9-10              198         27.3%           42.6%
    11-12               98         13.5%           56.1%
    13-14               81         11.1%           67.2%
    15-16              109         15.0%           82.2%
    17-18              108         14.9%           97.1%
    19 & older          20          2.8%          100.0%

    Total              725        100.0%

The next few items of information concern the experimental validity of the
comparisons, i.e., the incidence of various experimental controls and ex post
facto adjustments. In Table 6, the comparisons are tabulated by the type of
assignment of pupils to the different size classes. The type of assignment
labeled "repeated measures" refers to the use of the same group of pupils in both
a small and a large class and the comparison of their achievement in the two
classes. Each of the first three types of assignment represents reasonably good
attempts at eliminating gross inadequacies in design; these three conditions
account for slightly more than half of all the comparisons. Even though half of
the comparisons involved comparing naturally constituted and non-equivalent large
and small classes, some of them were based on ex post facto statistical
adjustments for pre-existing differences. So the data are not half worthless;
indeed, whether the experimental inadequacies are important mediators of
findings is an empirical fact, rather than an a priori judgment, which will be
examined in detail later in this report.

Many studies attempted to control for the initial non-equivalence of small 
and large classes by correcting the achievement dependent variable, either by 
calculating simple gain-scores or by covariance adjusting means. We hasten to 
point out that an uncorrected dependent variable does not necessarily indicate a 
comparison of poor quality. Corrections might be quite irrelevant in a study
that matched or randomly assigned pupils to classes. 
                              Table 6

        Class-Size Comparisons (Δ) by Assignment of Pupils
                 to the Small and Large Classes

    Type of Assignment         No. of Δ's        %

    Random                         110         15.2%
    Matched                        235         32.4%
    "Repeated Measures"             18          2.5%
    Uncontrolled                   362         49.9%

    Total                          725        100.0%
Finally, the comparisons can be described by whether achievement was
measured with a "standardized test" (i.e., a published test for a national
market) or an ad hoc instrument designed specifically to measure achievement in
the immediate context of the instruction given (see Table 7).

In Table 8 appears the joint distribution of smaller and larger class-sizes
on which the 725 Δ's are based. For example, six Δ's derive from comparisons of
group sizes 1 and 3. The table contains only 550 entries instead of 725, since
comparisons would not be recorded in this tabulation if S and L were contained
within the same broad category (e.g., if S = 18 and L = 22). Such comparisons
were incorporated in all subsequent analyses, but the need to keep Table 8 down
to a reasonable size precluded the classification of all 725 Δ's. It is apparent
in Table 8 which size comparisons have been relatively overstudied and which have
been neglected. The dearth of comparisons of instructional group sizes in the
range from 2 to 10 pupils is particularly apparent.

Regression Analyses

The dependent variable, $\Delta_{S-L}$, in the regression analyses had the
following statistical properties:

Properties of Distribution of $\Delta_{S-L}$

    a) N = 725.
    b) Mean = .088; Median = .050.
    c) 40% of the $\Delta_{S-L}$ were negative; 60%, positive.
    d) Standard deviation = 0.401.
    e) Range: -1.98 to 2.54.
    f) Skewness = 1.151; Kurtosis = 7.461.
                              Table 7

      Class-Size Comparisons (Δ) by Type of Achievement Measure

    Type of Achievement Measure      No. of Δ's        %

    Standardized Test                    318         43.9%
    Ad Hoc Measure                       407         56.1%

    Total                                725        100.0%

                              Table 8

                 Joint Distribution of Smaller and
               Larger Class-sizes in the Comparisons

    Smaller                       Larger Class-size
    Class-size      2     3   4-5  6-10  11-16  17-23  24-34   ≥35

     1              1     6     1     3      7      1     34     0
     2                    0     1     0      0      1      0     0
     3                          0     0      0      0      6     0
     4-5                              0      0      1      2     0
     6-10                                    8      0      5     2
    11-16                                          19     44    27
    17-23                                                 78   106
    24-34                                                      197
    ≥35
On the average, the 725 $\Delta_{S-L}$'s were positive, i.e., over all
comparisons available (regardless of the class sizes compared) the results
favored the smaller class by about a tenth of a standard deviation in
achievement. This finding is not too interesting, however, since it disregards
the sizes of the classes being compared. One interesting feature of the Δ's is
that only 60% of them are positive, i.e., favor the smaller class in
achievement. This is so even though every effort was made in compiling the data
base to include studies spanning the full range of class-sizes from individual
tutorials to huge lectures. One suspects that the odds of observing a positive
Δ in the typical class-size range so often studied (15 to 40, say) are even
smaller, perhaps as low as 55% to 45%.

In these rough estimates is revealed one of the fundamental problems that
has made the class-size literature so difficult for reviewers. If the
relationship one seeks has only 55 to 45 odds of appearing and one looks for it
without all the tools of statistical analysis that can be mustered, the chances
of finding it are small. One need not wonder why narrative reviews of a dozen or
two studies produced little but confusion.
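
The point about 55-to-45 odds can be made concrete with a small binomial
sketch. It assumes, purely for illustration, that the comparisons a reviewer
tallies are independent and each favors the smaller class with probability
0.55; the report itself stresses that real comparisons are interdependent, so
this is a simplification rather than a model of the data base. The function
name is hypothetical.

```python
from math import comb


def prob_majority_positive(n: int, p: float) -> float:
    """P(more than half of n independent comparisons favor the smaller class).

    Each comparison is assumed to be positive with probability p.
    """
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


# A reviewer counting "votes" across a dozen or two dozen comparisons.
chance_with_12 = prob_majority_positive(12, 0.55)
chance_with_24 = prob_majority_positive(24, 0.55)
```

Under these assumptions a simple vote count over a dozen comparisons shows a
clear majority for smaller classes only somewhat more often than not, which is
consistent with the confusion the narrative reviews produced.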

To make sense of the class-size and achievement relationship, one must
account for the magnitude of the Δ's and their variance in terms of the actual
sizes of the smaller and larger classes. These are the purposes of the
regression analyses. In the remainder of this section, such regression analyses
are reported for the entire data set and for the data set stratified on several
important characteristics of the studies (e.g., age of pupils, validity of the
study).
1. Regression Analysis for Entire Data Set.

The model $\Delta_{S-L} = \beta_0 + \beta_1 S + \beta_2 S^2 + \beta_3 (L - S) + e$
was fit by least-squares for the 725 points. The results were as follows:

    Variables                                          Mean      St. Dev.

    Independent:
      S, size of smaller class                        23.243      11.463
      S^2                                            671.446     603.463
      L-S, difference between large & small class     19.906      20.671
    Dependent:
      Δ                                                0.088       0.401

    Correlations

               S       S^2      L-S       Δ

    S          1      .932     .004    -.271
    S^2                1       .011    -.135
    L-S                         1       .047

    Regression Analysis

    Multiple R = .426

    Source of Variation       df        MS
    Regression                 3      6.684
    Residual                 721       .132

    $\hat{\beta}_0 = .57072$   $\hat{\beta}_1 = -.03860$   $\hat{\beta}_2 = .00059$   $\hat{\beta}_3 = .00082$

The regression equation for estimating $\Delta_{S-L}$ is

$$\hat{\Delta}_{S-L} = .57072 - .03860 S + .00059 S^2 + .00082 (L - S)$$
Based on the entire data set, the following table of standardized
comparisons for selected class-sizes can be constructed:

                                               Standardized
                                               Differential
    Small Class Size     Large Class Size      Achievement, $\hat{\Delta}_{S-L}$

           1                    40                 .565
          10                    40                 .268
          20                    40                 .051
          30                    40                -.048
           1                    25                 .552
           5                    25                 .409
          10                    25                 .256
          15                    25                 .133
          20                    25                 .039



These data show that the difference in achievement between class-size 1,
i.e., individual instruction, and class-size 40 is more than one-half standard
deviation. The difference between class-size 20 and class-size 40 is only about
five hundredths standard deviation. Class-size differences at the low end of the
scale have quite important effects on achievement; differences at the high end
have little effect.
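
The tabled values follow directly from the fitted equation. The short sketch
below evaluates the full-data regression equation reported above at the same
class-size pairs; only the function and variable names are introduced here.

```python
# Full-data coefficient estimates as reported above.
B0, B1, B2, B3 = 0.57072, -0.03860, 0.00059, 0.00082


def delta_hat(smaller, larger):
    """Predicted standardized differential achievement for sizes S < L."""
    return B0 + B1 * smaller + B2 * smaller ** 2 + B3 * (larger - smaller)


pairs = [(1, 40), (10, 40), (20, 40), (30, 40),
         (1, 25), (5, 25), (10, 25), (15, 25), (20, 25)]

# Map each (S, L) pair to its prediction, rounded as in the report's table.
predicted = {pair: round(delta_hat(*pair), 3) for pair in pairs}
```

Because $\hat{\beta}_1$ is negative and $\hat{\beta}_2$ positive, the predicted
advantage shrinks quickly as the smaller class grows, which is the pattern the
paragraph above describes.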

The curved regression surface can be reduced to a single line curve in a
plane by imposing the consistency condition and solving for the pivot points.
The two pivot points are the solutions to

$$.57072 - .03860 P + .00059 P^2 = 0.$$
In this instance, the pivot points equal approximately 43 and 23. The lower
value, 23, was selected as the pivot point around which to construct the
connected curve; the choice was arbitrary and calculations not reported here
revealed it to be largely immaterial. The values of $\hat{\Delta}_{S-P}$ and
$\hat{\Delta}_{P-L}$ are as follows for P = 23:

    $\hat{\Delta}_{1-23}$   =  .551
    $\hat{\Delta}_{2-23}$   =  .513
    $\hat{\Delta}_{5-23}$   =  .407
    $\hat{\Delta}_{10-23}$  =  .254
    $\hat{\Delta}_{20-23}$  =  .037
    $\hat{\Delta}_{23-30}$  =  .001
    $\hat{\Delta}_{23-40}$  =  .009



Hence, on this curve the difference between achievement in class-sizes 1 and
40 is .551 + .009 = .560. The curve is presented in Figure 1. The ordinate is
represented by a standard score metric; the zero point is arbitrarily fixed at a
class-size of 30.
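
The pivot points and the values around P = 23 can be reproduced from the
reported coefficients. This is a sketch with illustrative names: it solves the
pivot quadratic with the ordinary quadratic formula and then evaluates the
regression equation with 23 as either the larger or the smaller class-size.

```python
import math

# Full-data coefficient estimates as reported in this section.
B0, B1, B2, B3 = 0.57072, -0.03860, 0.00059, 0.00082


def delta_hat(smaller, larger):
    """Predicted standardized differential achievement for sizes S < L."""
    return B0 + B1 * smaller + B2 * smaller ** 2 + B3 * (larger - smaller)


def pivot_points(b0, b1, b2):
    """Both roots of b0 + b1*P + b2*P^2 = 0, smallest first."""
    disc = b1 * b1 - 4.0 * b2 * b0
    if disc < 0:
        raise ValueError("no real pivot point for these coefficients")
    root = math.sqrt(disc)
    return sorted(((-b1 - root) / (2.0 * b2), (-b1 + root) / (2.0 * b2)))


low_pivot, high_pivot = pivot_points(B0, B1, B2)

PIVOT = 23  # rounded lower pivot, as chosen in the report
below_pivot = {n: delta_hat(n, PIVOT) for n in (1, 2, 5, 10, 20)}
above_pivot = {n: delta_hat(PIVOT, n) for n in (30, 40)}
span_1_to_40 = below_pivot[1] + above_pivot[40]
```

Since 23 is a rounded root, the consistency property holds only approximately
at the integer pivot; the discrepancy is the small value of the quadratic at
P = 23.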

In Figure 2, the curve in Figure 1 is translated into a metric of percentile
ranks on the ordinate by assuming a normal distribution of achievement. There it
can be seen that the difference in average performance from class-size 1 to
class-size 40 is from above the 70th percentile to just below the 50th. There is
nearly a ten percentile rank difference between instructional groups of sizes 10
and 20 pupils.
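
The translation to percentile ranks can be sketched as follows. The assembly of
the curve is an assumption on our part: achievement at each class-size is taken
relative to the pivot of 23 (a positive $\hat{\Delta}$ below the pivot, the
negative of $\hat{\Delta}$ above it), re-centered so that class-size 30 sits at
z = 0, and then passed through the standard normal distribution function. The
names below are illustrative.

```python
from statistics import NormalDist

# Full-data coefficient estimates as reported in this section.
B0, B1, B2, B3 = 0.57072, -0.03860, 0.00059, 0.00082
PIVOT = 23    # pivot point chosen in the report
ZERO_AT = 30  # class-size at which the z-scale is fixed at zero


def delta_hat(smaller, larger):
    """Predicted standardized differential achievement for sizes S < L."""
    return B0 + B1 * smaller + B2 * smaller ** 2 + B3 * (larger - smaller)


def relative_to_pivot(n):
    """Achievement at class-size n minus achievement at the pivot."""
    if n < PIVOT:
        return delta_hat(n, PIVOT)
    if n > PIVOT:
        return -delta_hat(PIVOT, n)
    return 0.0


def percentile_rank(n):
    """Percentile rank of average achievement at class-size n."""
    z = relative_to_pivot(n) - relative_to_pivot(ZERO_AT)
    return 100.0 * NormalDist().cdf(z)


ranks = {n: percentile_rank(n) for n in (1, 10, 20, 30, 40)}
```

Under these assumptions the ranks agree with the description of Figure 2:
class-size 1 falls above the 70th percentile and class-size 40 just below the
50th.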
Figure 1. Consistent regression line for achievement (standard score
units) onto class-size.

Figure 2. Consistent regression line for achievement (percentile ranks)
onto class-size (all data).
2. Regression Analyses for Sub-sections of the Data.

Regression analyses were performed for many smaller portions of the entire
data set in an attempt to determine which characteristics of the studies might
mediate the size of the class-size and achievement relationship. More than a
dozen factors were employed in splitting the data base: year of study, subject
taught, age of pupils, IQ, type of test, etc. Few of these characteristics were
systematically related to the strength of the class-size and achievement
correlation. Among those factors of discrimination that produced virtually
identical regression lines were "source of data," "subject taught," "duration
of instruction," "pupil IQ," and "type of achievement measure." From among the
few characteristics that appeared to interact with the relationship, three
stand out as particularly interesting: year of the study, level of schooling
(elementary vs. secondary), and internal validity of the study. The complete
regression analyses will be reported below for the latter two characteristics.
Details of the "year of study" analyses will not be reported here; suffice it
to note that there is no correlation between class-size and achievement in
those studies carried out before 1940 and a strong relationship favoring
smaller classes in post-1960 studies. The two eras differ in many respects,
most notably in terms of the sophistication of both experimental design and
measurement.

Elementary vs. Secondary. The curvilinear regression model in (2) was fit
separately for pupils of age 11 years or younger (elementary) and 12 years or
older (secondary). The summary statistics and solutions are as follows:
    Elementary (N = 342)

    Variables               Mean        St. Dev.

    Independent:  S        22.836        11.758
                  S^2     659.345      [illegible]
                  L-S      13.915      [illegible]
    Dependent:    Δ         0.092      [illegible]

    Correlations

               S       S^2      L-S       Δ

    S          1      .951    -.377    -.343
    S^2                1      -.345    -.215
    L-S                         1       .241

    Regression Analysis - Elementary Grades

    Multiple R = .505

    Source of Variation       df        MS
    Regression                 3      1.898
    Residual                 338       .049

    $\hat{\beta}_0 = .38503$   $\hat{\beta}_1 = -.02995$   $\hat{\beta}_2 = .00052$   $\hat{\beta}_3 = .00344$

$$\hat{\Delta}_{S-L} = .38503 - .02995 S + .00052 S^2 + .00344 (L - S)$$

    Secondary (N = 349)

    Variables               Mean        St. Dev.

    Independent:  S        23.642        11.168
                  S^2     683.304       646.598
                  L-S      25.777        26.641
    Dependent:    Δ         0.085         0.504

    Correlations

               S       S^2      L-S       Δ

    S          1      .924     .112    -.259
    S^2                1       .098    -.106
    L-S                         1       .024

    Regression Analysis - Secondary Grades

    Multiple R = .439

    Source of Variation       df        MS
    Regression                 3      5.667
    Residual                 345      0.207

    $\hat{\beta}_0 = .75539$   $\hat{\beta}_1 = -.05024$   $\hat{\beta}_2 = .00071$   $\hat{\beta}_3 = .00111$

$$\hat{\Delta}_{S-L} = .75539 - .05024 S + .00071 S^2 + .00111 (L - S)$$
Some particularly interesting values of $\hat{\Delta}$ on the two regression
surfaces are listed below:

                                                  Δ, Standardized
                                               Differential Achievement
    Smaller Class Size    Larger Class Size    Elementary    Secondary

            1                    40               .490          .749
           10                    40               .241          .357
           20                    40               .063          .057
           30                    40              -.011         -.102
            1                    10               .387          .716
            3                    10               .324          .619
            5                    10               .265          .527

The class-size and achievement relationship seems consistently stronger in
the secondary grades than in the elementary grades. This interaction is also
seen in Figure 3 where the consistent curves are drawn around pivot points of 19
for elementary and 22 for secondary. The ordinate scale in Figure 3 is
percentile ranks.
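
The elementary and secondary results can be cross-checked against the two
reported regression equations. The sketch below, with illustrative names,
evaluates each equation at selected class-size pairs and solves each pivot
quadratic.

```python
import math

# Coefficient estimates (b0, b1, b2, b3) reported for the two sub-samples.
COEFFS = {
    "elementary": (0.38503, -0.02995, 0.00052, 0.00344),
    "secondary": (0.75539, -0.05024, 0.00071, 0.00111),
}


def delta_hat(level, smaller, larger):
    """Predicted standardized differential achievement for one sub-sample."""
    b0, b1, b2, b3 = COEFFS[level]
    return b0 + b1 * smaller + b2 * smaller ** 2 + b3 * (larger - smaller)


def pivot_points(level):
    """Both roots of b0 + b1*P + b2*P^2 = 0 for a sub-sample, smallest first."""
    b0, b1, b2, _ = COEFFS[level]
    root = math.sqrt(b1 * b1 - 4.0 * b2 * b0)
    return sorted(((-b1 - root) / (2.0 * b2), (-b1 + root) / (2.0 * b2)))


lower_pivots = {level: pivot_points(level)[0] for level in COEFFS}
delta_1_vs_40 = {level: delta_hat(level, 1, 40) for level in COEFFS}
delta_5_vs_10 = {level: delta_hat(level, 5, 10) for level in COEFFS}
```

The lower roots round to the pivot points of 19 and 22 used for Figure 3, and
the secondary predictions exceed the elementary ones at the small class-sizes
where the two surfaces differ most.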

Well-Controlled vs. Poorly-Controlled Studies. The comparisons were
distinguished on the basis of degree of experimental control exercised in the
study. Although many features of experimental control could have been noted and
analyzed, the method of assignment of pupils to classes of different sizes
proved to be the most important. Over one hundred Δ's came from studies in which
pupils were assigned at random to larger and smaller classes; over three hundred
comparisons were "uncontrolled," i.e., naturally constituted larger and smaller
classes were compared. The summary statistics and solutions of the regression
models are as follows:
    Poorly-Controlled (N = 334)          Well-Controlled (N = 108)
The pivot points for the consistent regression curves are 17 and 48 for the
poorly-controlled studies and 17 and 32 for the well-controlled studies. These
curves calculated around class-size 17 appear in Figure 4 where the ordinate is
expressed in percentile ranks.

The curves in Figure 4 show large differences in the class-size and
achievement relationship depending on whether pupil assignment was random or
uncontrolled. This finding contrasts sharply with similar analyses of the
association between experimental design quality and effects in the field of
psychotherapy (Smith and Glass, 1977). The difference is probably due to the
magnitude of the effects that are the object of the research in the two fields.
The typical psychotherapy effect (therapy vs. control group) is between
three-quarters and a full standard deviation (Smith, Glass and Miller, 1979);
the typical class-size study was seeking to establish an effect of less than
one-tenth standard deviation. It is little surprise, then, that in one field
experimental design quality proves critical, and in another field it does not.

In an area of research where the quality of methodology interacts with the
findings of studies, the results of the best designed studies should be given
more weight in drawing conclusions. The curve for the well-controlled studies in
Figure 4, then, is probably the best representation of the class-size and
achievement relationship.

Concern was expressed by several persons who examined the preliminary
analyses that the curve for the well-controlled studies in Figure 4 might depend
excessively on the twenty or thirty comparisons of very small class-sizes (one
and two up to five, say) in the data base. When all those comparisons for which
S = 1 were removed, the curve in Figure 4 for well-controlled studies was even
steeper than that shown; this finding is contrary to the claim that tutoring
studies skewed the curve unnaturally. When all comparisons for which S was less
than 6 were removed, the curve for well-controlled studies became less steep;
however, it still rose from the 50th percentile at size 40 to the 60th at size
10, the 67th at size 5 and the 74th at size 1.

Conclusions 

Research on class-size and achievement is a particularly complex body of
findings to integrate and understand. The integration of this literature has
required more sophisticated analysis than has previously been applied to the
problem. The meta-analysis of the research reported here has drawn heavily on
precise quantitative description and analysis. A clear and strong relationship
between class-size and achievement has emerged. The relationship seems slightly
stronger at the secondary grades than the elementary grades; but it does not
differ appreciably across different school subjects, levels of pupil IQ, or
several other obvious demographic features of classrooms. The relationship is
seen most clearly in well-controlled studies in which pupils were randomly
assigned to classes of different sizes. Taking all findings of this
meta-analysis into account, it is safe to say that between class-sizes of 40
pupils and one pupil lie more than 30 percentile ranks of achievement. The
difference in achievement resulting from instruction in groups of 20 pupils and
groups of 10 can be larger than 10 percentile ranks in the central regions of
the distribution. There is little doubt that, other things equal, more is
learned in smaller classes.
APPENDIX
DATA LISTING

The raw data on which the analyses are based are listed on the
following pages. The key to decoding the variables appears in
Table 3.1 in the section of the report on Methods. Horizontal lines
separate the studies on the first page of the listing only. The
variables are numbered on the first page of the data listing. The
titles of the variables corresponding to these numbers are as follows:

1. ID#
2. Year
3. Source
4. Subject taught
5. Hours of instruction
6. Weeks of instruction
7. N for small classes
8. No. of teachers for small classes
9. Class-size (P/I) for small
10. Accuracy of P/I
11. N for large classes
12. No. of teachers for large classes
13. Class-size (P/I) for large
14. Accuracy of P/I
15. IQ
16. Age
17. Assignment of pupils
18. Assignment of teachers
19. Type of achievement measure
20. Subject of achievement measure
21. Quantification of outcomes
22. Congruence of instruction and achievement measure
23. Delta (S-L)
24. No. of times S greater than L
25. No. of times L greater than S